# What Are Policy Gradient Methods?

Policy gradient methods are a subclass of policy-based methods.

In the Introduction to Policy-Based Methods lesson, you learned about many policy-based methods that could approximate either a deterministic or stochastic policy.

In this lesson, we'll confine our attention to stochastic policies.

## Quiz

SOLUTION:

Use a softmax activation function in the output layer. This will ensure the network outputs probabilities. For each state input, sample an action from the output probability distribution.
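The idea in this solution can be sketched in a few lines of plain Python. This is only a minimal illustration, not the course's implementation: the single linear layer, the helper names (`softmax`, `policy_probs`, `sample_action`), and the sizes (4 state features, 2 actions) are all assumptions chosen for the example.

```python
import math
import random


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def policy_probs(state, weights, biases):
    """Hypothetical toy policy: one linear layer followed by softmax.

    weights[a] holds one weight per state feature for action a.
    """
    logits = [
        sum(w * s for w, s in zip(weights[a], state)) + biases[a]
        for a in range(len(weights))
    ]
    return softmax(logits)


def sample_action(state, weights, biases, rng=random):
    """Sample an action index from the policy's output distribution."""
    probs = policy_probs(state, weights, biases)
    action = rng.choices(range(len(probs)), weights=probs, k=1)[0]
    return action, probs


# Toy setup: 4 state features and 2 actions (sizes are assumptions).
rng = random.Random(0)
weights = [[rng.uniform(-1, 1) for _ in range(4)] for _ in range(2)]
biases = [0.0, 0.0]
state = [0.1, -0.2, 0.05, 0.3]

action, probs = sample_action(state, weights, biases, rng)
```

Because the action is sampled rather than chosen with an argmax, the same state can produce different actions on different calls, which is what makes the policy stochastic.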

SOLUTION:

Policy gradient methods are a subclass of policy-based methods.
Not all policy-based methods are policy gradient methods.
Both policy-based methods and policy gradient methods try to find the optimal policy directly, without maintaining value function estimates.